Section 1. Introduction and Summary

For an ergodic Markov chain in discrete or continuous time, we will
be interested in situations in which there exists an initial distribution
$\pi_0$, and a state j, such that $\pi_t(j) = P_{\pi_0}(X(t)=j)$ is increasing in t.
Also of interest are states, j, for which $p_t(j,j) = \Pr(X(t)=j \mid X(0)=j)$
is decreasing in t. We will examine the implications of such monotonicity,
as well as conditions for it to hold.

Let $\pi$ denote the stationary distribution and $T_{\pi,j}$ the waiting
time, starting in steady state, to reach state j, with $T_{\pi,j} = 0$ if
X(0) = j. Similarly define $T_{\pi_0,j}$. In Section 2 it is shown that if
$\pi_t(j) = P_{\pi_0}(X(t)=j)$ is increasing, then:

(1.1)    $T_{\pi_0,j} \sim Y + T_{\pi,j}$

where $\Pr(Y > t) = 1 - \pi_t(j)/\pi(j)$, and Y and $T_{\pi,j}$ are independent. One
consequence of (1.1) is that $T_{\pi_0,j}$ is stochastically larger than $T_{\pi,j}$;
another is that if $T_{\pi,j}$ is approximately exponential and EY is small
compared to $E T_{\pi,j}$ then $T_{\pi_0,j}$ is also approximately exponential.
Approximate exponentiality is discussed in Section 7.

A tempting heuristic interpretation of (1.1) is that Y represents
"the waiting time from $\pi_0$ to steady state." That is, if there were a random
variable Y with $X(Y) \sim \pi$, X(Y) independent of Y, and Pr(X(t)=j, for
some t < Y) = 0, then $T_{\pi_0,j}$ would equal Y (the waiting time to steady
state) plus $T_{\pi,j}$ (the waiting time from steady state to j), and Y and
$T_{\pi,j}$ would be independent. This suggests that the distribution of Y
(given above) is that of a strong stationary time, in the sense of Aldous
and Diaconis (1987). This interpretation holds precisely in a large class
of situations described in Section 4. However, in general, $\pi_t(j)$ increasing
does not imply that the distribution of Y is that of a strong stationary
time.


If $p_t(j,j)$ is decreasing, in discrete or continuous time, for a
state j, then $T_{\pi,j}$ can be represented as a geometric convolution.
Specifically:

(1.2)    $T_{\pi,j} = \sum_{i=1}^{N} W_i$

where $\{W_i, i=1,2,\ldots\}$ are i.i.d., N is independent of $\{W_i\}$ with
$\Pr(N=k) = (1-\pi(j))^k \pi(j)$, $k=0,1,\ldots$, and

(1.3)    $\Pr(W > t) = \frac{p_t(j,j)-\pi(j)}{1-\pi(j)}$.

This representation has two uses. Firstly, geometric convolutions with
small p (in this case $p=\pi(j)$) are approximately exponential. Error bounds
can be obtained from the first two moments of $T_{\pi,j}$. Secondly, the moments
of $T_{\pi,j}$ are easily related to those of W, which in turn can be expressed in
terms of the eigenvalues. In the case of time reversible chains in continuous
time this leads to:

(1.4)    $\sup_t \left| \Pr(T_{\pi,j} > t) - e^{-t/\alpha_j} \right| \le \frac{\tau/\alpha_j}{(\tau/\alpha_j)+1}$

where $\alpha_j = E T_{\pi,j}$ and $\tau$ is the relaxation time, defined as $\lambda_1^{-1}$, where
$0 = \lambda_0 > -\lambda_1 \ge -\lambda_2 \ge \cdots \ge -\lambda_m$ are the eigenvalues of the infinitesimal matrix.
This provides a quantification and generalization of Proposition 1 of Aldous
(1989), who showed that $\tau/\alpha_j$ was a key parameter in approximate exponentiality
for random walks on vertex transitive graphs.

We now describe the class of situations for $\pi_t(j)$ monotonicity previously
alluded to. Consider ergodic Markov chains in discrete or continuous time,
taking values in a partially ordered set S, possessing a unique maximum
state M ($i \le^* M$ for all $i \in S$). Further, assume that the time reversed process
is stochastically monotone relative to the partial ordering, and that
$\pi_0(k)/\pi(k)$ is decreasing in $k \in S$ relative to the partial ordering. Then
$\pi_t(M)$ is increasing in t, and:

(1.5)    $s(t) = \max_k \left(1 - \frac{\pi_t(k)}{\pi(k)}\right) = 1 - \frac{\pi_t(M)}{\pi(M)}$.


The quantity s(t) is the separation between $\pi_t$ and $\pi$, as studied
by Aldous and Diaconis (1987), and Diaconis and Fill (1989). It yields an
upper bound for total variation distance, which is the main quantity of
interest in the study of how rapidly a Markov chain approaches ergodicity.
Expression (1.5) tells us that for all t, the separation is achieved at
state M, which we call a separating state. This greatly simplifies the
computation of s(t) and examples are presented to illustrate this point.

In the case where S is totally ordered, the above conditions coincide
with those of Theorem 4.6 of Diaconis and Fill (1989), under which a dual
process with convenient properties is constructed, and then employed to study
s(t). The current approach offers an alternative to duality, and allows for
the flexibility of choice of partial ordering of S. Examples are given to
show that for the same Markov chain, different partial orderings can be used
for different initial conditions, resulting in a wide array of $\pi_t(j)$
monotonicity and convenient computation of s(t).

Another class of processes considered are discrete time ergodic
Markov chains with state space {0,...,M}, satisfying
$\bar P(i_1,j_1)\bar P(i_2,j_2) \ge \bar P(i_1,j_2)\bar P(i_2,j_1)$ for all $i_1 < i_2$, $j_1 < j_2$,
where $\bar P(i,j) = \sum_{k=j}^{M} P(i,k)$.
For such a chain, if $\pi_0(i)/\pi(i)$ is decreasing, then $\pi_n(M)$ is increasing
in n, and (1.5) holds.
Section 2. Derivation of (1.1).

Consider, first, a continuous time chain. Let $T_{\pi,j}$ denote the first
passage time to j under $\pi$, with $T_{\pi,j} = 0$ if X(0)=j; similarly
define $T_{\pi_0,j}$. Now:

(2.1)    $\pi(j) = P_\pi(X(t)=j) = \int_0^t p_{t-x}(j,j)\, dF_{\pi,j}(x)$

where $F_{\pi,j}$ is the distribution of $T_{\pi,j}$. Take Laplace transforms in (2.1)
to obtain:

(2.2)    $\int_0^\infty e^{-st}\, dF_{\pi,j}(t) = \frac{\pi(j)}{s\,\psi_{jj}(s)}$

where $\psi_{jj}$ is the Laplace transform of $p_t(j,j)$.

Assume that $\pi_t(j) = P_{\pi_0}(X(t)=j)$ is increasing in t. Since
$\lim_{t\to\infty} \pi_t(j) = \pi(j)$, $\pi_t(j) \le \pi(j)$ for all t, thus $\pi_t(j)/\pi(j)$ is a cdf. Let
Y denote a random variable with this cdf, and let $\mathcal{L}_Y$ denote the Laplace
transform of Y, $\psi_t$ the Laplace transform of $\pi_t(j)$, and $\mathcal{L}_{\pi_0,j}$ the
Laplace transform of $T_{\pi_0,j}$. Then:

(2.3)    $\mathcal{L}_Y(s) = s \int_0^\infty e^{-st} \Pr(Y \le t)\, dt = \frac{s\,\psi_t(s)}{\pi(j)}$

Analogous to (2.1) we have:

(2.4)    $\pi_t(j) = \int_0^t p_{t-x}(j,j)\, dF_{\pi_0,j}(x)$


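From (2.1), (2.3) and (2.4):

(2.5)    $\mathcal{L}_{\pi_0,j}(s) = \frac{\psi_t(s)}{\psi_{jj}(s)} = \left(\frac{s\,\psi_t(s)}{\pi(j)}\right)\left(\frac{\pi(j)}{s\,\psi_{jj}(s)}\right) = \mathcal{L}_Y(s)\,\mathcal{L}_{\pi,j}(s)$

where $\mathcal{L}_{\pi,j}$ denotes the Laplace transform of $T_{\pi,j}$, the left side of (2.2).
Thus $T_{\pi_0,j} \sim Y + T_{\pi,j}$ and (1.1) is proved.

In the discrete time case Y is a discrete random variable with cdf
$F(n) = \pi_n(j)/\pi(j)$. The above argument holds using probability generating
functions in place of Laplace transforms.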
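The discrete time form of (1.1) can be checked exactly on a small chain. The sketch below is only an illustration under stated assumptions: it borrows the three-state matrix from the example in Section 4, starts the chain at state 0, takes j = 2, and uses helper names (`first_passage_pmf`, `y_pmf`, `convolve`) that are hypothetical and not from the paper. It compares the law of $T_{\pi_0,j}$ with the convolution of the laws of Y and $T_{\pi,j}$ up to a finite horizon.

```python
from fractions import Fraction as F

# Three-state chain borrowed from the example in Section 4 (an assumption of this sketch).
P = [[F(1, 4), F(1, 2), F(1, 4)],
     [F(1, 8), F(1, 8), F(3, 4)],
     [F(1, 8), F(1, 8), F(3, 4)]]
PI = [F(4, 28), F(5, 28), F(19, 28)]   # stationary distribution of P
J, N = 2, 12                           # target state j and finite horizon


def step(dist, P):
    """One step of the chain: row vector times transition matrix."""
    m = len(P)
    return [sum(dist[i] * P[i][k] for i in range(m)) for k in range(m)]


def first_passage_pmf(start, P, j, n_max):
    """pmf of the first passage time to j (equal to 0 if the chain starts in j)."""
    pmf = [start[j]]
    alive = [F(0) if k == j else start[k] for k in range(len(P))]  # mass not yet at j
    for _ in range(n_max):
        nxt = step(alive, P)
        pmf.append(nxt[j])   # mass that reaches j for the first time at this step
        nxt[j] = F(0)
        alive = nxt
    return pmf


def y_pmf(start, P, pi, j, n_max):
    """pmf of Y, whose cdf is pi_n(j)/pi(j)."""
    dist, cdf = list(start), []
    for _ in range(n_max + 1):
        cdf.append(dist[j] / pi[j])
        dist = step(dist, P)
    return [cdf[0]] + [cdf[n] - cdf[n - 1] for n in range(1, n_max + 1)]


def convolve(a, b):
    """First len(a) terms of the convolution of two pmfs."""
    return [sum(a[k] * b[n - k] for k in range(n + 1)) for n in range(len(a))]


delta0 = [F(1), F(0), F(0)]
lhs = first_passage_pmf(delta0, P, J, N)                       # law of T_{pi_0, j}
rhs = convolve(y_pmf(delta0, P, PI, J, N),
               first_passage_pmf(PI, P, J, N))                 # law of Y + T_{pi, j}
```

Exact rational arithmetic is used so that the two sides can be compared with `==` rather than up to a tolerance.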


Section 3. Conditions for Monotonicity of $\pi_t(j)$.

First we present some elementary facts about stochastically monotone
Markov chains, with finite partially ordered state spaces.

Let S be a finite set with partial ordering denoted by $\le^*$. Define
A to be an upper set if $x \in A$ and $y \ge^* x$ implies $y \in A$. Similarly, define
lower sets. Define a discrete time Markov chain with state space S and
probability transition matrix P to be stochastically monotone relative to
$\le^*$ if:

$P(x,A) = \sum_{y \in A} P(x,y) \le P(y,A)$

for all $x \le^* y$ and upper sets A. Define a real-valued function h on S
to be increasing relative to $\le^*$ if $x \le^* y$ implies $h(x) \le h(y)$. If h is
increasing then the elements of S can be labeled as $d_1,\ldots,d_n$, in such a
way that $i < j$ implies $h(d_i) \le h(d_j)$, and that $d_i$ is not greater than $d_j$
under the partial ordering (Brown and Chaganty (1983) p. 1007). It follows
that $A_j = \{d_j,\ldots,d_n\}$ is an upper set, $j=1,\ldots,n$. Defining $h(d_0) = 0$
($d_0$ is not in S) we have:

$E_i h(X) = E(h(X_1) \mid X_0=i) = \sum_j P(i,j)h(j)$
$= \sum_{j=1}^{n} P(i,d_j) \sum_{r=1}^{j} (h(d_r)-h(d_{r-1})) = \sum_{r=1}^{n} (h(d_r)-h(d_{r-1})) P(i,A_r)$.

It follows that if P is stochastically monotone and h is increasing,
then $E_i h(X)$ is increasing (all relative to $\le^*$). Since:

$p_{n+1}(i,A) = \Pr(X_{n+1} \in A \mid X_0=i) = E_i\, p_n(X_1,A)$

it follows by induction that stochastic monotonicity implies that $p_n(i,A)$
is increasing in i, for all n and upper sets A. Similarly $E_i h(X_n)$
is increasing in i for all increasing h, and all n.

In the continuous time case, let $\{X(t), t \ge 0\}$ be a Markov chain with
state space S and infinitesimal matrix Q. Define $\{X(t), t \ge 0\}$ to be
stochastically monotone, relative to $\le^*$, if both of the following hold:

(i) $Q(x,A) = \sum_{y \in A} q(x,y) \le Q(y,A)$
for all $x \le^* y$, and upper sets A not containing y.

(ii) $Q(x,B) \ge Q(y,B)$ for all $x \le^* y$,
and lower sets B not containing x.

If (i) and (ii) hold, choose $c \ge 2\max_i \sum_{k \ne i} q_{ik}$, and define $P = I + c^{-1}Q$.
Then P is stochastically monotone as is seen in the three possible cases:

1) If $x \le^* y$ and A is an upper set not containing y then:

$P(x,A) = c^{-1}Q(x,A) \le c^{-1}Q(y,A) = P(y,A)$.

2) If $x \le^* y$ and A is an upper set containing x:

$P(x,A) = 1 - c^{-1}Q(x,\bar A) \le 1 - c^{-1}Q(y,\bar A) = P(y,A)$

where $\bar A$, the complement of A, is a lower set.

3) If $x \le^* y$ and A is an upper set containing y but not x, then:

$P(x,A) = c^{-1}Q(x,A)$

$P(y,A) = 1 - c^{-1}Q(y,\bar A)$

thus,

$P(y,A)-P(x,A) = 1 - c^{-1}[Q(x,A)+Q(y,\bar A)] \ge 1 - \frac{2\max_i \sum_{k \ne i} q_{ik}}{c} \ge 0$.

Using stochastic monotonicity of P, we find that for $x \le^* y$, and
upper sets A:

$p_t(x,A) = \sum_{n=0}^{\infty} \frac{(ct)^n e^{-ct}}{n!} P^n(x,A) \le \sum_{n=0}^{\infty} \frac{(ct)^n e^{-ct}}{n!} P^n(y,A) = p_t(y,A)$.

Thus $P_i(X(t) \in A)$ is increasing in i for all t and upper sets A. Similarly
if h is increasing then $E_i(h(X(t)))$ is increasing in i for all t. We
summarize the above:

Lemma 3.1. Let $\{X_t, t=0,1,\ldots\}$ or $\{X(t), t \ge 0\}$ be a Markov chain with
finite state space S, stochastically monotone relative to $\le^*$. Then:

(i) $p_t(x,A)$ is increasing in x for all t and upper sets A.

(ii) $E_i h(X_t)$ is increasing in i for all t and increasing h. []
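The definition of stochastic monotonicity over a partial ordering can be checked by brute force on a small state space. The following is a minimal sketch, not part of the paper: the function names `upper_sets` and `is_stochastically_monotone` are hypothetical, the order is passed as a predicate `leq(x, y)`, and the test matrix is the four-state example of Section 5.3.1 with the partial ordering in which 0, 1, 2 are incomparable and all lie below 3.

```python
from fractions import Fraction as F
from itertools import combinations


def upper_sets(states, leq):
    """All upper sets of the finite poset (states, leq), including the empty set."""
    found = []
    for r in range(len(states) + 1):
        for subset in combinations(states, r):
            s = set(subset)
            # s is an upper set if it contains everything above each of its members
            if all(y in s for x in s for y in states if leq(x, y)):
                found.append(s)
    return found


def is_stochastically_monotone(P, states, leq):
    """Check P(x, A) <= P(y, A) for all x <= y and all upper sets A."""
    ups = upper_sets(states, leq)
    for x in states:
        for y in states:
            if leq(x, y):
                for A in ups:
                    if sum(P[x][k] for k in A) > sum(P[y][k] for k in A):
                        return False
    return True


# Four-state matrix of Section 5.3.1, entries given in hundredths.
P = [[F(n, 100) for n in row] for row in
     [(20, 30, 50, 0), (30, 40, 20, 10), (50, 20, 15, 15), (0, 10, 15, 75)]]
STATES = [0, 1, 2, 3]


def total_order(x, y):
    """The usual total ordering 0 < 1 < 2 < 3."""
    return x <= y


def star_order(x, y):
    """0, 1, 2 are mutually incomparable and all lie below 3."""
    return x == y or y == 3
```

Enumerating all subsets is exponential in the number of states, so this is only meant for small illustrative examples.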

If $\{X_n, n=0,1,\ldots\}$ is ergodic with stationary distribution $\pi$, the
time reversed chain is the Markov chain with transition matrix
$\tilde P(i,j) = \frac{\pi(j)}{\pi(i)} P(j,i)$. Similarly for continuous time ergodic chains, the reversed
chain has infinitesimal matrix $\tilde Q(i,j) = \frac{\pi(j)}{\pi(i)} Q(j,i)$.
Time reversible chains satisfy $\tilde P = P$, in discrete time, and $\tilde Q = Q$ in
continuous time.

Theorem 3.2, below, derives conditions for $\pi_t(j)$ increasing in t.

Theorem 3.2. Let $\{X_t, t=0,1,\ldots\}$ or $\{X(t), t \ge 0\}$ be an ergodic Markov
chain taking values in S, where S is a finite partially ordered set with
partial ordering $\le^*$. Assume that S contains a unique maximal element M,
so that $x \le^* M$ for all $x \in S$. Then if the time reversed process, $\tilde X$, is
stochastically monotone, and $\pi_0(x)/\pi(x)$ is decreasing in $x \in S$ (both with
respect to $\le^*$), then $\pi_t(M)$ is increasing in t.

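Proof. We first show that $\pi_t(M)/\pi(M)$ minimizes $\pi_t(j)/\pi(j)$, over $j \in S$.
This follows since:

(3.3)    $\frac{\pi_t(M)}{\pi(M)} = \sum_i \frac{\pi_0(i)}{\pi(M)} p_t(i,M) = \sum_i \frac{\pi_0(i)}{\pi(i)} \tilde p_t(M,i) = E_M \frac{\pi_0}{\pi}(\tilde X_t) \le E_j \frac{\pi_0}{\pi}(\tilde X_t) = \frac{\pi_t(j)}{\pi(j)}$

the last inequality holding since $\pi_0/\pi$ is decreasing and $\tilde X$ is stochastically
monotone.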

To show that (3.3) implies that $\pi_t(M)$ is increasing in t, consider
the hypothesis testing problem, $H_0: X_0 \sim \pi_0$ vs $H_1: X_0 \sim \pi$, based on the data
$(X_s, X_t)$. The densities under $H_0$ and $H_1$ are given by $f_0(x,y) = \pi_s(x) p_{t-s}(x,y)$
and $f_1(x,y) = \pi(x) p_{t-s}(x,y)$. The test which has smallest type 1 error among
all tests with type 2 error $\le 1-\pi(M)$, rejects $H_0$ if X(s) = M (Neyman-Pearson
Lemma). The type 1 error of this test is $\pi_s(M)$. A less efficient competing
test with type 2 error $1-\pi(M)$, rejects $H_0$ if X(t)=M, and has larger type 1
error, $\pi_t(M)$. Thus $\pi_s(M) \le \pi_t(M)$ for $s < t$. []
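Theorem 3.2 can be illustrated numerically. The sketch below is an assumption-laden example rather than anything from the paper: it uses the symmetric four-state matrix of Section 5.3.1 (symmetric, hence time reversible with uniform stationary distribution), the partial ordering with unique maximal element M = 3, and the initial distribution concentrated at state 0, for which $\pi_0/\pi$ is decreasing in that ordering. It records $\pi_n(M)$ and checks that $\pi_n(k)/\pi(k)$ is smallest at k = M, as in (3.3).

```python
from fractions import Fraction as F

# Symmetric four-state matrix of Section 5.3.1, entries in hundredths.
P = [[F(n, 100) for n in row] for row in
     [(20, 30, 50, 0), (30, 40, 20, 10), (50, 20, 15, 15), (0, 10, 15, 75)]]
PI = [F(1, 4)] * 4      # P is symmetric, so the uniform distribution is stationary
M = 3                   # unique maximal element of the partial ordering


def evolve(dist, P, n_steps):
    """Return [pi_0, pi_1, ..., pi_{n_steps}] for the chain started from dist."""
    m = len(P)
    path = [list(dist)]
    for _ in range(n_steps):
        cur = path[-1]
        path.append([sum(cur[i] * P[i][k] for i in range(m)) for k in range(m)])
    return path


path = evolve([F(1), F(0), F(0), F(0)], P, 25)
mass_at_M = [dist[M] for dist in path]                 # pi_n(M), n = 0, ..., 25
ratio_min_at_M = all(
    dist[M] / PI[M] == min(d / p for d, p in zip(dist, PI)) for dist in path
)
```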


Section  4.  Separation  Distance  and  Strong  Stationary  Times. 

Separation  distance  and  its  connection  to  strong  stationary  times  is 
an  elegant  contribution  of  Aldous  and  Diaconis  (1987),  with  important  recent 
developments  by  Diaconis  and  Fill  (1989). 

For an ergodic Markov chain with finite state space, and initial
distribution $\pi_0$, the separation at t is defined by:

$s(t) = \max_k \left(1 - \frac{\pi_t(k)}{\pi(k)}\right)$.

Separation provides an upper bound for the total variation norm:

(4.1)    $d(t) = \max_{B \subseteq S} |\pi_t(B)-\pi(B)| = \sum_{\pi(k)>\pi_t(k)} (\pi(k)-\pi_t(k))$
$= E_\pi\left[\left(1 - \frac{\pi_t}{\pi}(X)\right)^+\right] \le P_\pi(\pi(X) > \pi_t(X))\, s(t) \le s(t)$.

Call the Markov chain separable with separating state M, under $\pi_0$,
if $s(t) = 1 - \frac{\pi_t(M)}{\pi(M)}$ for all t.

A strong stationary time is a stopping time, T, with $X(T) \sim \pi$ and
X(T) independent of T. For any strong stationary time, $s(t) \le \Pr(T>t)$
(Aldous and Diaconis (1987) p. 72). When equality holds for all t, T is
called a minimal strong stationary time. Aldous and Diaconis (1987) construct
a minimal strong stationary time for a general ergodic, finite state, discrete
time Markov chain.
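The bound (4.1) is easy to see numerically. The following sketch is illustrative only: it uses the three-state matrix from the example later in this section, started at state 0, and the helper names `separation` and `total_variation` are hypothetical. It records the pair (d(n), s(n)) for the first few n.

```python
from fractions import Fraction as F

# Three-state chain from the example later in this section (assumed here for illustration).
P = [[F(1, 4), F(1, 2), F(1, 4)],
     [F(1, 8), F(1, 8), F(3, 4)],
     [F(1, 8), F(1, 8), F(3, 4)]]
PI = [F(1, 7), F(5, 28), F(19, 28)]    # stationary distribution of P


def separation(dist, pi):
    """s = max_k (1 - dist(k)/pi(k))."""
    return max(1 - d / p for d, p in zip(dist, pi))


def total_variation(dist, pi):
    """d = sum over states with pi(k) > dist(k) of (pi(k) - dist(k))."""
    return sum(p - d for d, p in zip(dist, pi) if p > d)


def step(dist, P):
    """One step of the chain: row vector times transition matrix."""
    m = len(P)
    return [sum(dist[i] * P[i][k] for i in range(m)) for k in range(m)]


dist, history = [F(1), F(0), F(0)], []
for n in range(10):
    history.append((total_variation(dist, PI), separation(dist, PI)))
    dist = step(dist, P)
```

Each entry of `history` is a pair (d(n), s(n)); in every pair the first component should not exceed the second.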

Corollary 4.1. Under the conditions of Theorem 3.2 the Markov chain is
separable with separating state M. Furthermore there exists a minimal strong
stationary time, Y, satisfying:

(i) $\Pr(Y > t) = 1 - \frac{\pi_t(M)}{\pi(M)} = s(t)$, for all t.

(ii) Pr(X(t)=M, for some t < Y) = 0.

Define $Z = [\min\{t \ge Y: X(t) = M\} - Y]$. Then Z is independent of Y
and distributed as $T_{\pi,M}$. Thus $T_{\pi_0,M} = Y+Z$ with
$\Pr(Y>t) = 1 - \frac{\pi_t(M)}{\pi(M)} = s(t)$, and $Z \sim T_{\pi,M}$, Y and Z independent.

Proof. By the proof of Theorem 3.2:

$\frac{\pi_t(M)}{\pi(M)} \le \frac{\pi_t(j)}{\pi(j)}$ for all $j \in S$

thus $s(t) = 1 - \frac{\pi_t(M)}{\pi(M)}$, and the Markov chain is separable with separating
state M.

In discrete time, the Aldous-Diaconis construction produces a minimal
strong stationary time, Y, which by the details of their construction
satisfies (ii). Since Y is a minimal strong stationary time, and the process
is separable, $\Pr(Y > n) = s(n) = 1 - \frac{\pi_n(M)}{\pi(M)}$.

For a continuous time chain, by choosing $c = 2\max_i \sum_{k \ne i} q_{ik}$, we obtain a
discrete time skeleton, $P = I + c^{-1}Q$, which satisfies the above conditions.
Denoting the embedded Markov chain by $\{X'_n, n=0,1,\ldots\}$, we can represent
$\{X(t), t \ge 0\}$ by $\{X(t) = X'_{N(t)}, t \ge 0\}$ where $\{N(t), t \ge 0\}$ is a Poisson process
of rate c, independent of $\{X'_n, n=0,1,\ldots\}$. Denote the event epochs from
the Poisson process as $\{S_n, n \ge 1\}$, and define:

$Y = S_{Y'}$,

where Y' is a minimal strong stationary time for $\{X'_n\}$.

Clearly, Y is a strong stationary time for the continuous time
process and:

$\Pr(Y > t) = \sum_{n=0}^{\infty} \frac{(ct)^n e^{-ct}}{n!} \Pr(Y' > n) = \sum_{n=0}^{\infty} \frac{(ct)^n e^{-ct}}{n!} \left(1 - \frac{\pi'_n(M)}{\pi(M)}\right) = 1 - \frac{\pi_t(M)}{\pi(M)} = s(t)$.

Thus Y is a minimal strong stationary time. Furthermore:

Pr(X(t)=M, for some t < Y) = Pr($X'_n$=M, for some n < Y') = 0.

Thus (i) and (ii) hold in continuous time.

In view of (ii), $T_{\pi_0,M} = Y+Z$. Define $\{X^*(t), t=0,1,\ldots\}$ ($\{X^*(t), t \ge 0\}$
in continuous time) by $X^*(t) = X(Y+t)$. Since Y is a strong stationary
time, $\{X^*(t)\}$ is independent of Y. Since Z is the first passage time
to M for the $X^*$ process, Z is also independent of Y. Since
$X^*(0) = X(Y) \sim \pi$, it follows that $Z \sim T_{\pi,M}$. Thus $T_{\pi_0,M} = Y+Z$, with Y
and Z independent, $\Pr(Y>t) = s(t)$, and $Z \sim T_{\pi,M}$.

(4.2) Remark. Define the chain distance from state x to state y, d(x,y),
to be the minimal n such that $p_n(x,y) > 0$. If $\pi_0 = \delta_x$ (one point
distribution at {x}) then a necessary condition for M to be a separating state
is that $d(x,y) \le d(x,M)$ for all $y \in S$. This is true since if M is a
separating state and $p_k(x,M) > 0$, then:

$1 > s(k) = 1 - \frac{p_k(x,M)}{\pi(M)} \ge 1 - \frac{p_k(x,y)}{\pi(y)}$

thus $p_k(x,y) > 0$.

In applications there is often only one potential separating state.
The problem then reduces to producing a partial ordering under which M is
the unique maximal state, x is a minimal state, and $\tilde P$ is stochastically
monotone. In Section 5, this point is illustrated by examples. []

Under the conditions of Corollary 4.1, $\pi_t(j)/\pi(j)$ is decreasing in
j and thus assumes its minimum at M. Of course monotonicity of $\pi_t(j)/\pi(j)$
is considerably stronger than $\pi_t(j)/\pi(j) \ge \pi_t(M)/\pi(M)$. For a totally ordered
state space, say {0,...,M} with $0<1<2<\cdots<M$, a weaker condition than
$\pi_t(j)/\pi(j)$ decreasing, is $\bar\pi_t(j)/\bar\pi(j)$ decreasing, where $\bar\pi_t(j) = \sum_{k=j}^{M} \pi_t(k)$,
$\bar\pi(j) = \sum_{k=j}^{M} \pi(k)$. This is true since:

(4.2)    $\frac{\bar\pi_t(j)}{\bar\pi(j)} = \sum_{k=j}^{M} \left(\frac{\pi_t(k)}{\pi(k)}\right)\left(\frac{\pi(k)}{\bar\pi(j)}\right) = E_\pi\left[\frac{\pi_t}{\pi}(X) \,\Big|\, X \ge j\right]$

Since $\pi_t/\pi$ is decreasing and $X \mid X \ge j$ is stochastically increasing in j (where
$X \sim \pi$), it follows that $\bar\pi_t(j)/\bar\pi(j)$ is decreasing.

Furthermore, $\bar\pi_t(j)/\bar\pi(j)$ decreasing implies $\pi_t(j)/\pi(j) \ge \pi_t(M)/\pi(M)$
for $j = 0,\ldots,M$. To see this define $\gamma_j = \min\{\frac{\pi_t(k)}{\pi(k)}, k \ge j\}$, $j = 0,\ldots,M$.
Note that by (4.2), for $j=0,\ldots,M-1$:

(4.3)    $\frac{\bar\pi_t(j)}{\bar\pi(j)} = \frac{\pi(j)}{\bar\pi(j)} \cdot \frac{\pi_t(j)}{\pi(j)} + \frac{\bar\pi(j+1)}{\bar\pi(j)} \cdot \frac{\bar\pi_t(j+1)}{\bar\pi(j+1)}$

Since $\bar\pi_t/\bar\pi$ is decreasing, it follows from (4.3) that:

$\frac{\pi_t(j)}{\pi(j)} \ge E_\pi\left[\frac{\pi_t}{\pi}(X) \,\Big|\, X \ge j+1\right]$,  $j=0,\ldots,M-1$.

Thus

$\gamma_j = \min\{\frac{\pi_t(k)}{\pi(k)}, k \ge j\} = \min\left(\frac{\pi_t(j)}{\pi(j)}, \gamma_{j+1}\right) = \gamma_{j+1}$

for $j=0,\ldots,M-1$. Thus:

(4.4)    $\gamma_0 = \min\{\frac{\pi_t(k)}{\pi(k)}, k = 0,\ldots,M\} = \gamma_M = \frac{\pi_t(M)}{\pi(M)}$.

It follows from (4.4) that if for all t, $\bar\pi_t/\bar\pi$ is decreasing, then
the process is separable with separating state M, and:

$s(t) = 1 - \frac{\pi_t(M)}{\pi(M)}$.


We now need to find conditions under which $\bar\pi_t/\bar\pi$ is decreasing for
all t.

Define a discrete time Markov chain on {0,...,M} to be failure rate
monotone if $i_1 < i_2$, $j_1 < j_2$ implies:

(4.5)    $\bar P(i_1,j_1)\bar P(i_2,j_2) \ge \bar P(i_1,j_2)\bar P(i_2,j_1)$

where $\bar P(i,j) = \sum_{k=j}^{M} P(i,k)$. This condition is equivalent to $i_1 < i_2$ implies
$\bar P(i_1,j)/\bar P(i_2,j)$ decreasing in j, for j such that $\bar P(i_2,j) > 0$.
It is also equivalent to $P(i,j)/\bar P(i,j)$ decreasing in i for each j,
where $P(i,j)/\bar P(i,j)$ is defined to be 1 if $\bar P(i,j)=0$. Shanthikumar [(1988),
Lemma 2.1, p. 399] proves that if (4.5) is satisfied for P, then it is
also satisfied for $P^n$ for $n \ge 2$.

Gathering  together  the  above  observations  we  derive: 


Lemma 4.2. Let $\{X_n, n=0,1,\ldots\}$ be an ergodic Markov chain with state
space {0,1,...,M}, which is failure rate monotone. Then $\pi_0(j)/\pi(j)$
decreasing implies that $\pi_n(M)$ is increasing in n, and that

$s(n) = 1 - \frac{\pi_n(M)}{\pi(M)}$

for all n.


Proof. It follows by the above remarks that we just need prove that
$\bar\pi_n(j)/\bar\pi(j)$ is decreasing in j, equivalently that $\pi_n(j)/\bar\pi_n(j) \ge
\pi(j)/\bar\pi(j)$ for all j. Define

$h_{n,i}(j) = \frac{p_n(i,j)}{\bar p_n(i,j)}$,  $h_n(j) = \frac{\pi_n(j)}{\bar\pi_n(j)}$  and  $h_\pi(j) = \frac{\pi(j)}{\bar\pi(j)}$;  then

$h_n(j) = \frac{\sum_i \pi_0(i) p_n(i,j)}{\sum_i \pi_0(i) \bar p_n(i,j)} = \sum_i c_{n,j}(i)\, h_{n,i}(j)$

$h_\pi(j) = \frac{\sum_i \pi(i) p_n(i,j)}{\sum_i \pi(i) \bar p_n(i,j)} = \sum_i d_{n,j}(i)\, h_{n,i}(j)$

where $c_{n,j}(i) \propto \pi_0(i)\bar p_n(i,j)$ and $d_{n,j}(i) \propto \pi(i)\bar p_n(i,j)$ are probability
distributions on {0,...,M}. Now by failure rate monotonicity of P, and hence of $P^n$,
$h_{n,i}(j)$ is decreasing in i. Moreover $c_{n,j}(i)/d_{n,j}(i) = a\,(\pi_0(i)/\pi(i))$ with a
a constant, thus $d_{n,j}$ is larger than $c_{n,j}$ under monotone likelihood ratio
ordering, and thus under stochastic ordering. Thus $h_n(j) \ge h_\pi(j)$ so that
$\bar\pi_n(j)/\bar\pi(j)$ is decreasing, and therefore (by (4.4)), $s(n) = 1 - \frac{\pi_n(M)}{\pi(M)}$.
Since, by the Aldous-Diaconis construction, s(n) is decreasing, it follows
that $\pi_n(M)$ is increasing. []
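Example. Consider the following Markov chain on {0,1,2}:

$P = \begin{pmatrix} 1/4 & 1/2 & 1/4 \\ 1/8 & 1/8 & 3/4 \\ 1/8 & 1/8 & 3/4 \end{pmatrix}$

The time reversed chain is given by:

$\tilde P = \begin{pmatrix} 1/4 & 5/32 & 19/32 \\ 2/5 & 1/8 & 19/40 \\ 1/19 & 15/76 & 3/4 \end{pmatrix}$

Since $\tilde P(1,0) > \tilde P(0,0)$, $\tilde P$ is not stochastically monotone, and
Corollary 4.1 does not apply. However, P is failure rate monotone
($h_0(0) = \frac{1}{4} > \frac{1}{8} = h_1(0) = h_2(0)$, $h_0(1) = \frac{2}{3} > \frac{1}{7} = h_1(1) = h_2(1)$,
$h_0(2) = h_1(2) = h_2(2) = 1$), where $h_i(j) = P(i,j)/\bar P(i,j)$.
We conclude that $p_n(0,2)$ is increasing, and that

$s_0(n) = 1 - \frac{p_n(0,2)}{\pi(2)} = \frac{96}{19}\left(\frac{1}{8}\right)^n$,  $n \ge 1$.

We further remark that $\pi_n(j)/\pi(j)$ is not decreasing in j:

$\frac{p_n(0,0)}{\pi(0)} = 1 + 6\left(\frac{1}{8}\right)^n < \frac{p_n(0,1)}{\pi(1)} = 1 + \frac{72}{5}\left(\frac{1}{8}\right)^n$,  $\frac{p_n(0,2)}{\pi(2)} = 1 - \frac{96}{19}\left(\frac{1}{8}\right)^n$,  $n \ge 1$.

I have not worked on extending Lemma 4.2 to continuous time, or to
partially ordered spaces.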
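The numbers in this example can be reproduced with exact rational arithmetic. The sketch below is only a check of the example, with hypothetical helper names (`matmul`, `tail`, `failure_rate_monotone`); it verifies the stationary distribution (1/7, 5/28, 19/28), forms the time reversed matrix, tests condition (4.5), and computes the separation from state 0 for n = 1,...,8.

```python
from fractions import Fraction as F

P = [[F(1, 4), F(1, 2), F(1, 4)],
     [F(1, 8), F(1, 8), F(3, 4)],
     [F(1, 8), F(1, 8), F(3, 4)]]
PI = [F(1, 7), F(5, 28), F(19, 28)]
S = range(3)


def matmul(A, B):
    """Product of two 3 x 3 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in S) for j in S] for i in S]


def tail(A, i, j):
    """bar-A(i, j) = sum over k >= j of A(i, k)."""
    return sum(A[i][k] for k in range(j, 3))


def failure_rate_monotone(A):
    """Condition (4.5) for all i1 < i2 and j1 < j2."""
    return all(
        tail(A, i1, j1) * tail(A, i2, j2) >= tail(A, i1, j2) * tail(A, i2, j1)
        for i1 in S for i2 in S if i1 < i2
        for j1 in S for j2 in S if j1 < j2
    )


# pi P = pi, and the time reversal pi(j) P(j, i) / pi(i)
is_stationary = [sum(PI[i] * P[i][j] for i in S) for j in S] == PI
reversed_P = [[PI[j] * P[j][i] / PI[i] for j in S] for i in S]

# separation from state 0 at times n = 1, ..., 8
separations, Pn = [], P
for n in range(1, 9):
    separations.append(max(1 - Pn[0][k] / PI[k] for k in S))
    Pn = matmul(Pn, P)
```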
Section 5. Examples.

5.1. Non-symmetric walk on the unit cube.

Consider a discrete time Markov chain, $\{X_n, n=0,1,\ldots\}$, with state
space $\{0,1\}^d$, the $2^d$ d-tuples of 0's and 1's. The chain evolves by
changing at most one coordinate at a time. Thus if $\delta_i$ is a d-vector
with $\delta_i(j) = 1$ if $j=i$ and $\delta_i(j) = 0$ if $j \ne i$, then:

$P(x,x+\delta_i) = \alpha_i(x)$ for $x_i = 0$

$P(x,x-\delta_i) = \beta_i(x)$ for $x_i = 1$

$P(x,x) = 1 - \sum_{x_i=0} \alpha_i(x) - \sum_{x_i=1} \beta_i(x)$.

Assume that $\alpha_i(x)$, $\beta_i(x)$ are such that the chain is ergodic. Let
$\pi_0 = \delta_x$, a point distribution at x. By Remark (4.2), the only potential
separating state is x+1, where $(x+1)_i = x_i+1 \pmod 2$. This suggests partial
ordering by chain distance (see Remark (4.2)), i.e. $y <^* z$ if and only if
$d(x,y) < d(x,z)$. Under this ordering, x is the unique minimal state, and
x+1 the unique maximal state. The problem then reduces to finding conditions
for stochastic monotonicity of $\tilde P$.

Consider a special case of the above with $\alpha_i(x) = \alpha_i$, $\beta_i(x) = \beta_i$,
independent of x, with:

(i) $\alpha_i > 0$, $\beta_i > 0$, $i=1,\ldots,d$

(ii) $\sum_{i \in A} \alpha_i + \sum_{i \notin A} \beta_i < 1$ for some $A \subseteq \{1,2,\ldots,d\}$.

(iii) $\max\left(\sum_{i=1}^{d} \alpha_i + \max_i \beta_i,\ \sum_{i=1}^{d} \beta_i + \max_i \alpha_i\right) \le 1$.

Condition (i) is necessary and sufficient for irreducibility, and
(ii) for aperiodicity. The chain is then easily shown to be time reversible.
Condition (iii) ensures that the chain is stochastically monotone, with
respect to the above partial ordering.

It then follows from Corollary 4.1 that:

$s_x(n) = \max_y \left(1 - \frac{p_n(x,y)}{\pi(y)}\right) = 1 - \frac{p_n(x,x+1)}{\pi(x+1)}$.

To compute $s_x(n)$, consider the continuous time chain with $Q = P - I$.
The continuous time process is composed of d independent 0-1 processes.
Thus:

(5.1)    $p_t(x,x+1) = \left[\prod_{x_i=0} \frac{\alpha_i}{\alpha_i+\beta_i}\left(1-e^{-(\alpha_i+\beta_i)t}\right)\right]\left[\prod_{x_i=1} \frac{\beta_i}{\alpha_i+\beta_i}\left(1-e^{-(\alpha_i+\beta_i)t}\right)\right]$
$= \pi(x+1) \prod_{i=1}^{d} \left(1-e^{-(\alpha_i+\beta_i)t}\right) = \pi(x+1) \sum_{k=0}^{d} (-1)^k \sum_{\gamma \in A_k} e^{-s_\gamma t}$

where $A_k$ consists of the $\binom{d}{k}$ subsets of size k from {1,...,d}, and
$s_\gamma = \sum_{i \in \gamma} (\alpha_i+\beta_i)$, for $\gamma$ a subset of {1,...,d}. From the above expression
for $p_t(x,x+1)$, and the spectral representation, $p_t(x,x+1) =
\pi(x+1) + \sum_j u_j(x) v_j(x+1) e^{-\lambda_j t}$ (Keilson (1979) p. 34), we see that the
eigenvalues of Q are $\{-s_\gamma, \gamma \subseteq \{1,\ldots,d\}\}$ and thus the eigenvalues of
$P = I + Q$ are $\{1-s_\gamma, \gamma \subseteq \{1,\ldots,d\}\}$. From the spectral representation, since
$p_n(x,x+1) = \pi(x+1) + \sum_j u_j(x) v_j(x+1) (1-\lambda_j)^n$ where $u_j(x)$, $v_j(x+1)$
are the same terms as in $p_t(x,x+1)$, it follows from (5.1) that:

(5.2)    $s_x(n) = 1 - \frac{p_n(x,x+1)}{\pi(x+1)} = \sum_{k=1}^{d} (-1)^{k+1} \sum_{\gamma \in A_k} (1-s_\gamma)^n$.


In the case $\sum_{i=1}^{d} (\alpha_i+\beta_i) \le 1$, we recognize (5.2) as an inclusion-exclusion
formula. Specifically, consider n multinomial trials with cell probabilities,
$p_i = \alpha_i+\beta_i$, $i=1,\ldots,d$, and $p_{d+1} = 1-\sum_{i=1}^{d} (\alpha_i+\beta_i)$. Let $A_n$ be the event that
at least one of the cells 1,...,d is empty. Then $A_n = \bigcup_{i=1}^{d} C_i$, where
$C_i$ = {cell i is empty}, and (5.2) represents the inclusion-exclusion formula
for $\Pr(\bigcup_{i=1}^{d} C_i)$. Thus if we let T denote the waiting time for all of cells
1,...,d to each be occupied, then:

$\Pr(T > n) = \Pr(A_n) = s_x(n)$.

By using a construction of the author (Brown (1975), p. 379, (1984)
p. 608) we can construct a minimal strong stationary time for the continuous
time process, which when applied to the embedded discrete time process is
indeed the waiting time for cells 1,...,d to be occupied. Adapting the
construction to discrete time requires $\sum_i (\alpha_i+\beta_i) \le 1$. When $\sum_i (\alpha_i+\beta_i) > 1$
(with (i), (ii), (iii) still holding), I have no simplifying explanation for
(5.2). It is pleasantly surprising that $s_x(n)$ is independent of x.
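Formula (5.2) can be compared with a direct computation on a small cube. The sketch below is an illustration under assumptions: d = 2, the flip probabilities `ALPHA` and `BETA` are arbitrary values chosen here to satisfy (i), (ii) and (iii), and the function names are hypothetical. It propagates the point mass at x for n steps, forms $1 - p_n(x,x+1)/\pi(x+1)$, and evaluates the inclusion-exclusion sum on the right of (5.2).

```python
from fractions import Fraction as F
from itertools import combinations, product

ALPHA = [F(1, 5), F(1, 10)]    # illustrative 0 -> 1 flip probabilities (assumed values)
BETA = [F(1, 10), F(3, 10)]    # illustrative 1 -> 0 flip probabilities (assumed values)
D = len(ALPHA)
STATES = list(product((0, 1), repeat=D))


def transition(x, y):
    """One-step probability P(x, y) for the walk on {0,1}^d."""
    diff = [i for i in range(D) if x[i] != y[i]]
    if len(diff) > 1:
        return F(0)
    if len(diff) == 1:
        i = diff[0]
        return ALPHA[i] if x[i] == 0 else BETA[i]
    return 1 - sum(ALPHA[i] if x[i] == 0 else BETA[i] for i in range(D))


def stationary(y):
    """Product-form stationary probability of y."""
    prob = F(1)
    for i in range(D):
        prob *= (ALPHA[i] if y[i] == 1 else BETA[i]) / (ALPHA[i] + BETA[i])
    return prob


def antipode_separation(x, n):
    """1 - p_n(x, x+1)/pi(x+1), by propagating the point mass at x for n steps."""
    dist = {s: F(int(s == x)) for s in STATES}
    for _ in range(n):
        dist = {y: sum(dist[z] * transition(z, y) for z in STATES) for y in STATES}
    far = tuple(1 - xi for xi in x)      # the state x+1 (every coordinate flipped)
    return 1 - dist[far] / stationary(far)


def inclusion_exclusion(n):
    """Right-hand side of (5.2)."""
    total = F(0)
    for k in range(1, D + 1):
        for gamma in combinations(range(D), k):
            s_gamma = sum(ALPHA[i] + BETA[i] for i in gamma)
            total += (-1) ** (k + 1) * (1 - s_gamma) ** n
    return total
```

For instance, `antipode_separation((0, 0), 2)` and `inclusion_exclusion(2)` can be compared directly, and the same comparison can be repeated for every starting state x.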
The symmetric walk $\alpha_i = \beta_i = \frac{1-r}{d}$ has been considered by Diaconis
(1988), and Diaconis and Fill (1989). In this case (i) is satisfied for
$r < 1$, (ii) for $r > 0$, and (iii) for $r \ge \frac{1}{d+1}$. Diaconis and Fill utilize
the symmetry to reduce the problem to a birth and death process on
{0,...,d}. We discuss birth and death processes in Section 5.2.

5.2.  Skip  free  to  the  right  Markov  chains. 

A Markov chain on {0,...,d} is defined to be skip free to the right
if P(i,j) = 0 for $j-i \ge 2$. An important special case is a birth and death
process, which is skip free to both the left and right, i.e. P(i,j) = 0
for $|j-i| \ge 2$.

A birth and death chain on {0,...,d} is irreducible if
P(i,i+1)P(i+1,i) > 0 for i=0,...,d-1. It is aperiodic if P(i,i) > 0 for
some i. An ergodic birth and death chain is time reversible. It is
stochastically monotone if and only if $P(i,i+1)+P(i+1,i) \le 1$ for i=0,...,d-1.
An ergodic birth and death process in continuous time is necessarily
stochastically monotone.

It follows from Corollary 4.1 that if a skip free to the right Markov
chain is ergodic and $\tilde P$ is stochastically monotone, then under $\pi_0 = \delta_0$
the chain is separable with separating state d, thus:

(5.3)    $s_0(n) = 1 - \frac{p_n(0,d)}{\pi(d)}$.

In the Appendix we show that for an ergodic skip free to the right chain
on [0,d], with distinct eigenvalues $1,\beta_1,\ldots,\beta_d$, that:

(5.4)    $p_n(0,d) = \pi(d)\left[1 - \sum_{j=1}^{d} \beta_j^n \left(\prod_{r \ne j} \frac{1-\beta_r}{\beta_j-\beta_r}\right)\right]$.


From (5.3) and (5.4) we derive:

Lemma 5.1. (i) Let $\{X_n, n=0,1,\ldots\}$ be an ergodic skip free to the right
Markov chain on [0,...,d], with P possessing distinct eigenvalues
$1, \beta_1,\ldots,\beta_d$. Assume that $\tilde P$ is stochastically monotone. Then:

$s_0(n) = \sum_{j=1}^{d} \beta_j^n \left(\prod_{r \ne j} \frac{1-\beta_r}{\beta_j-\beta_r}\right)$.

In particular this will hold for an ergodic stochastically monotone
birth and death chain on [0,...,d]. When $\beta_i > 0$, $i=1,\ldots,d$, $s_0(n) = \Pr(T > n)$,
where T is the convolution of d independent geometric distributions with
parameters $\beta_1,\ldots,\beta_d$.

(ii) Let $\{X(t), t \ge 0\}$ be an ergodic skip free to the right Markov
chain on [0,...,d] with Q possessing distinct eigenvalues $0,-\lambda_1,\ldots,-\lambda_d$.
Assume that $\tilde Q$ is stochastically monotone. Then:

$s_0(t) = \sum_{j=1}^{d} \left(\prod_{r \ne j} \frac{\lambda_r}{\lambda_r-\lambda_j}\right) e^{-\lambda_j t}$.

In particular this will hold for an ergodic birth and death process on
[0,...,d]. When $\lambda_j > 0$, $j=1,\ldots,d$, $s_0(t) = \Pr(T>t)$, where T is the
convolution of d independent exponential distributions with parameters
$\lambda_1,\ldots,\lambda_d$. This is always the case for birth and death chains.
Proof. (i) The expression for $s_0(n)$ follows from (5.3) and (5.4). It
holds for ergodic stochastically monotone birth and death chains, because
such chains necessarily have distinct eigenvalues (Keilson (1979) p. 59).
The convolution result follows from an argument, given for the continuous
time case, in Brown and Shao (1987), p. 72.

(ii) For c sufficiently large, the discrete time embedded chain,
$P = I + c^{-1}Q$, will satisfy the conditions in (i). The eigenvalues of P are
$1, \beta_i = 1-c^{-1}\lambda_i$, $i=1,\ldots,d$. Note that $(1-\beta_r)/(\beta_j-\beta_r) = \lambda_r/(\lambda_r-\lambda_j)$. Thus:

$s_0(t) = \sum_{n=0}^{\infty} \frac{(ct)^n e^{-ct}}{n!} \sum_{j=1}^{d} \left(\prod_{r \ne j} \frac{\lambda_r}{\lambda_r-\lambda_j}\right) \beta_j^n = \sum_{j=1}^{d} \left(\prod_{r \ne j} \frac{\lambda_r}{\lambda_r-\lambda_j}\right) e^{-ct(1-\beta_j)} = \sum_{j=1}^{d} \left(\prod_{r \ne j} \frac{\lambda_r}{\lambda_r-\lambda_j}\right) e^{-\lambda_j t}$.

The convolution interpretation of $s_0(t)$ follows from Brown and Shao
(1987) p. 72. Finally an ergodic birth and death process on [0,...,d] has
distinct eigenvalues $0,-\lambda_1,\ldots,-\lambda_d$ with $\lambda_i > 0$, $i=1,\ldots,d$ (Keilson (1979)
p. 59). It thus satisfies the above conditions, and moreover, the convolution
interpretation of $s_0(t)$ holds. []

Diaconis and Fill (1989) derived the above expression for $s_0(n)$ in
the case of ergodic stochastically monotone birth and death chains. Their
method was based on a construction under which $s_0(n)$ is a first passage
time distribution for a dual birth and death chain. The expression for
$s_0(n)$ then follows from the same Brown-Shao approach as used here. Several
examples are presented in their paper in which the eigenvalues are known and
$s_0(n)$ explicitly computed. This includes the symmetric random walk on the
cube, where they derive a special case of (5.2).
An example of a skip free to the right, non birth and death chain,
satisfying the conditions of Lemma 5.1 is now given:

P = [matrix entries not recoverable from this copy]


Here 


=  -3308,  e2  =  jp3  =  -.0303 


$s_0(n) = 2.8575(.3308)^n - 1.8575(.0303)^n$


In  Section  5.3.2  we  consider  another  partial  ordering  which  applies  to 
birth  and  death  chains. 

5.3. Other partial orderings.

In  Section  5.1  we  used  chain  distance  partial  ordering,  and  in  5.2 
the  usual  total  ordering.  We  now  discuss  two  partial  orderings.  Undoubtedly 
there  are  many  others  that  can  conveniently  apply  in  specific  cases. 


5.3.1. Consider the following Markov transition matrix, with states
0,1,2,3:

\[
P = \begin{pmatrix}
.2 & .3 & .5 & 0 \\
.3 & .4 & .2 & .1 \\
.5 & .2 & .15 & .15 \\
0 & .1 & .15 & .75
\end{pmatrix}
\]

P is not stochastically monotone with respect to any total ordering
on {0,1,2,3}. However, consider the partial ordering $(0,1,2) <^* 3$, with
0,1,2 not comparable to one another. A general upper set consists of
$\{3\} \cup A$, where A is any subset (perhaps empty) of {0,1,2}. P is
stochastically monotone with respect to this partial ordering if and only if
$P(3,j) \le P(i,j)$ for i,j = 0,1,2. This is satisfied for the above chain,
which is symmetric, thus $\tilde{P} = P$ is stochastically monotone with respect to
$\le^*$. Moreover P is also doubly stochastic, thus $\pi$ is uniform. An
initial distribution $\pi_0$ satisfies $\pi_0/\pi$ monotone decreasing with respect
to the above partial ordering if and only if $\pi_0(i) \ge \pi_0(3)$ for i=0,1,2.
It follows from Corollary 4.1 that for every such $\pi_0$, $s(n) = 1 - 4\pi_n(3)$, in
particular $s_i(n) = 1 - 4p_n(i,3)$ for i=0,1,2.
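The claims of 5.3.1 can be verified by brute force on the 4 x 4 matrix. The following is a minimal sketch in exact rational arithmetic; the helper names (`upper_sets`, `is_monotone`, `separation`) are hypothetical and introduced only for this illustration. Stochastic monotonicity is taken to mean that $P(x,U) \le P(y,U)$ for every upper set U whenever x precedes y.

```python
from fractions import Fraction as F
from itertools import combinations, permutations

# Transition matrix of 5.3.1, entered as exact fractions.
P = [[F(x) for x in row.split()] for row in (
    "0.2 0.3 0.5 0",
    "0.3 0.4 0.2 0.1",
    "0.5 0.2 0.15 0.15",
    "0 0.1 0.15 0.75",
)]
N = len(P)

def upper_sets(leq):
    """All subsets U such that x in U and leq(x, y) imply y in U."""
    found = []
    for size in range(N + 1):
        for U in combinations(range(N), size):
            if all(y in U for x in U for y in range(N) if leq(x, y)):
                found.append(U)
    return found

def is_monotone(leq):
    """Check P(x,U) <= P(y,U) for all upper sets U and all x below y."""
    for U in upper_sets(leq):
        mass = [sum(P[x][u] for u in U) for x in range(N)]
        if any(leq(x, y) and mass[x] > mass[y]
               for x in range(N) for y in range(N)):
            return False
    return True

def partial_leq(x, y):
    """The order (0,1,2) <* 3: only state 3 dominates the others."""
    return x == y or y == 3

def total_leq(order):
    """Total order given as a tuple listing the states from smallest to largest."""
    rank = {state: r for r, state in enumerate(order)}
    return lambda x, y: rank[x] <= rank[y]

def separation(i, n):
    """s_i(n) = 1 - 4 p_n(i,3), using that pi is uniform on four states."""
    row = [F(int(k == i)) for k in range(N)]
    for _ in range(n):
        row = [sum(row[k] * P[k][x] for k in range(N)) for x in range(N)]
    return 1 - 4 * row[3]

print(is_monotone(partial_leq))
print(any(is_monotone(total_leq(o)) for o in permutations(range(N))))
print([separation(0, n) for n in range(4)])
```

Enumerating all subsets is only practical for very small state spaces; it is used here because it follows the definition directly.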

5.3.2. Consider the following birth and death chain, with states {0,1,2,3}:

P = [matrix entries not recoverable from this copy]



P is not stochastically monotone with respect to 0 < 1 < 2 < 3, because
P(1,0) > P(0,0), but otherwise it would be. To salvage stochastic monotonicity
define a partial order by $(0,1) <^* 2 <^* 3$, with 0 and 1 not comparable.
Then stochastic monotonicity with respect to $\le^*$ is easily shown to be
equivalent to:

\[
\begin{aligned}
P(0,1) &\ge P(2,1) \\
P(1,1) &\ge P(2,1) \\
P(2,3) &\le P(3,3)
\end{aligned}
\]


The above birth and death chain satisfies these constraints and is thus
stochastically monotone with respect to $\le^*$. It follows that
$s_0(n) = 1 - 4p_n(0,3)$, to which (5.4) applies. This is the result we would have
obtained if the chain were monotone with respect to the usual ordering.
As a bonus, we also conclude that $s_1(n) = 1 - 4p_n(1,3)$, which by
(8.9) in the Appendix reduces to:

\[
s_1(n) = \sum_{j=1}^{3} \Big( \prod_{r \ne j} \frac{1-\beta_r}{\beta_j-\beta_r} \Big) \Big( \frac{\beta_j - P(0,0)}{P(0,1)} \Big) \beta_j^n ,
\]

where $\beta_1, \beta_2, \beta_3$ are the eigenvalues of P other than 1.

Next, consider the following birth and death matrix:

\[
\begin{pmatrix}
1/2 & 1/2 & 0 & 0 \\
1/2 & 1/4 & 1/4 & 0 \\
0 & 1/4 & 1/2 & 1/4 \\
0 & 0 & 1/2 & 1/2
\end{pmatrix}
\]

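Note that this matrix is stochastically monotone relative to $(0,1) <^* 2 <^* 3$,
as was P. It is also stochastically monotone relative to $(2,3) <^* 1 <^* 0$,
since $P(3,2) \ge P(1,2)$, $P(2,2) \ge P(1,2)$ and $P(1,0) \le P(0,0)$. As a result
we have:

\[
\begin{aligned}
s_0(n) = s_3(n) &= \sum_{j=1}^{3} \Big( \prod_{r \ne j} \frac{1-\beta_r}{\beta_j-\beta_r} \Big) \beta_j^n \\
s_1(n) &= \sum_{j=1}^{3} \Big( \prod_{r \ne j} \frac{1-\beta_r}{\beta_j-\beta_r} \Big) \Big( \frac{\beta_j - P(0,0)}{P(0,1)} \Big) \beta_j^n \\
s_2(n) &= \sum_{j=1}^{3} \Big( \prod_{r \ne j} \frac{1-\beta_r}{\beta_j-\beta_r} \Big) \Big( \frac{\beta_j - P(3,3)}{P(3,2)} \Big) \beta_j^n
\end{aligned}
\]

Generalizing the above to general d, we obtain:

Lemma 5.2. (i) Consider a discrete time ergodic birth and death Markov
chain on {0,...,d}, with eigenvalues $1, \beta_1, \dots, \beta_d$, satisfying:

\[
\begin{aligned}
&P(i,1) \ge P(2,1), \quad i=0,1 \\
&P(i,i+1) + P(i+1,i) \le 1, \quad i=2,\dots,d-1 .
\end{aligned}
\]

Then:

\[
\begin{aligned}
s_0(n) &= \sum_{j=1}^{d} \Big( \prod_{r \ne j} \frac{1-\beta_r}{\beta_j-\beta_r} \Big) \beta_j^n \\
s_1(n) &= \sum_{j=1}^{d} \Big( \prod_{r \ne j} \frac{1-\beta_r}{\beta_j-\beta_r} \Big) \Big( \frac{\beta_j - P(0,0)}{P(0,1)} \Big) \beta_j^n
\end{aligned}
\]

(ii) Consider a continuous time ergodic birth and death Markov chain
on {0,...,d}, with eigenvalues $0, -\lambda_1, \dots, -\lambda_d$, and satisfying
$q_{01} \ge q_{21}$. Then: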
Section 6. Derivation of (1.2).

First, consider a continuous time chain with $P_t(j,j)$ decreasing in
t, for a state j. Let W be a random variable with:

\[
\Pr(W > t) = \frac{P_t(j,j) - \pi(j)}{1 - \pi(j)} . \tag{6.1}
\]

Letting $\phi_W$ denote the Laplace transform of W, we have:

\[
\phi_W(s) = 1 - s \int_0^\infty e^{-st} \Pr(W > t)\,dt . \tag{6.2}
\]

Furthermore, with $\psi_{jj}$ the Laplace transform of $P_t(j,j)$:

\[
\int_0^\infty e^{-st} \Pr(W > t)\,dt = \frac{\psi_{jj}(s) - \pi(j)/s}{1 - \pi(j)} . \tag{6.3}
\]

Thus, from (6.2) and (6.3):

\[
s\,\psi_{jj}(s) = 1 - (1-\pi(j))\,\phi_W(s) . \tag{6.4}
\]

Substitute (6.4) into (2.2) to obtain:

\[
\phi_{T_{\pi,j}}(s) = \frac{\pi(j)}{1 - (1-\pi(j))\,\phi_W(s)} . \tag{6.5}
\]

The right side of (6.5) is the Laplace transform of $\sum_{i=1}^{N} W_i$, which proves
(1.2).

In the discrete time case with $P_n(j,j)$ decreasing, define W to be
an integer valued random variable with
$\Pr(W > n) = \dfrac{P_n(j,j) - \pi(j)}{1 - \pi(j)}$. Then
apply the above argument, using probability generating functions in place
of Laplace transforms. Again, $T_{\pi,j} \sim \sum_{i=1}^{N} W_i$.
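The discrete time representation can be checked exactly on a small chain. The sketch below is an illustration under stated assumptions: the 4-state birth and death matrix and the target state j = 3 are chosen for the example, N is taken to be geometric with $\Pr(N=k) = \pi(j)(1-\pi(j))^k$ (as the form of (6.5) suggests), and all variable names are hypothetical.

```python
from fractions import Fraction as F

# Illustrative birth and death chain on {0,1,2,3} (an assumption for this sketch).
P = [[F(1, 2), F(1, 2), F(0), F(0)],
     [F(1, 2), F(1, 4), F(1, 4), F(0)],
     [F(0), F(1, 4), F(1, 2), F(1, 4)],
     [F(0), F(0), F(1, 2), F(1, 2)]]
pi = [F(2, 7), F(2, 7), F(2, 7), F(1, 7)]   # stationary distribution of P
j, n_max = 3, 12
S = range(len(P))

def step(row):
    """Apply one transition of the chain to a row vector."""
    return [sum(row[k] * P[k][x] for k in S) for x in S]

# u[n] = P_n(j,j) for n = 0..n_max
row, u = [F(int(k == j)) for k in S], []
for _ in range(n_max + 1):
    u.append(row[j])
    row = step(row)

# Pr(W > n) as defined above, and the probability mass function of W on {1,2,...}
tail_W = [(x - pi[j]) / (1 - pi[j]) for x in u]
f = [F(0)] + [tail_W[k - 1] - tail_W[k] for k in range(1, n_max + 1)]

# pmf of W_1 + ... + W_N, obtained by conditioning on whether N >= 1
g = [pi[j]]
for n in range(1, n_max + 1):
    g.append((1 - pi[j]) * sum(f[k] * g[n - k] for k in range(1, n + 1)))

# pmf of T_{pi,j} computed directly: start from pi and remove mass on arrival at j
v = [F(0) if k == j else pi[k] for k in S]
tail_T = [sum(v)]
for _ in range(n_max):
    v = step(v)
    v[j] = F(0)
    tail_T.append(sum(v))
pmf_T = [pi[j]] + [tail_T[n - 1] - tail_T[n] for n in range(1, n_max + 1)]

print(g[:4])
print(pmf_T[:4])
```

Because everything is computed with fractions, the two printed lists can be compared for exact equality rather than up to rounding.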


(6.6) Remark. For time reversible continuous time Markov chains with finite
state space, $P_t(j,j)$ is completely monotone and thus decreasing for
all j (Keilson (1979) p. 34). The discrete time analogue requires that all
the eigenvalues be non-negative.

For ergodic Markov chains in discrete or continuous time, taking values
in a finite partially ordered set $(S, \le^*)$, with either P or $\tilde{P}$ stochastically
monotone relative to $\le^*$, we now argue that $P_t(j,j)$ is decreasing, where j
is a unique maximal or unique minimal state.

Suppose that j is a unique maximal (minimal) state and that P is
stochastically monotone. Then {j} is an upper (lower) set and
$P_t(j,j) \ge P_t(k,j)$ for all $k \in S$, and t. Then for $t_1 < t_2$:

\[
P_{t_2}(j,j) = \sum_{k \in S} P_{t_2-t_1}(j,k)\,P_{t_1}(k,j) \le P_{t_1}(j,j) .
\]

Thus $P_t(j,j)$ is decreasing in t.

If $\tilde{P}$ is stochastically monotone, then by the same argument $\tilde{P}_t(j,j)$
is decreasing in t, but $\tilde{P}_t(j,j) = P_t(j,j)$, thus $P_t(j,j)$ is decreasing.

Section 7. Approximate Exponentiality.

The representation $T_{\pi,j} \sim \sum_{i=1}^{N} W_i$ is useful in studying approximate
exponentiality.

First we record a result of the author (Brown (1987)) dealing with
geometric convolutions:

Lemma 7.1. Suppose that $Y = \sum_{i=1}^{N} X_i$, where $\{X_i\}$ is an i.i.d. sequence of
nonnegative random variables and N is independent of $\{X_i\}$ with
$\Pr(N=k) = q^k p$, k=0,1,.... Then:

\[
\sup_t \big| \Pr(Y > t) - e^{-t/EY} \big| \le p\,\frac{EX^2}{(EX)^2} = 2q\rho_Y ,
\]

where $\rho_Y = \dfrac{EY^2}{2(EY)^2} - 1$. []
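The equality of the two forms of the bound in Lemma 7.1 can be checked from the first two moments of the geometric sum. The following worked sketch uses only the standard facts $EN = q/p$ and $\mathrm{Var}\,N = q/p^2$ for $\Pr(N=k) = q^k p$; it is a verification of the stated identity, not a reproduction of the argument in Brown (1987).

```latex
\begin{align*}
EY   &= EN \cdot EX = \frac{q}{p}\,EX, \\
EY^2 &= EN \cdot \mathrm{Var}\,X + EN^2 \cdot (EX)^2
      = \frac{q}{p}\bigl(EX^2 - (EX)^2\bigr) + \frac{q(1+q)}{p^2}\,(EX)^2
      = \frac{q}{p}\,EX^2 + \frac{2q^2}{p^2}\,(EX)^2, \\
\rho_Y &= \frac{EY^2}{2(EY)^2} - 1
      = \frac{p^2}{2q^2 (EX)^2}\Bigl(\frac{q}{p}\,EX^2 + \frac{2q^2}{p^2}\,(EX)^2\Bigr) - 1
      = \frac{p\,EX^2}{2q\,(EX)^2}, \\
2q\rho_Y &= p\,\frac{EX^2}{(EX)^2}.
\end{align*}
```

The second line uses $1 + q - p = 2q$.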

Applying Lemma 7.1 and (1.2) we find that:

\[
\sup_t \big| \Pr(T_{\pi,j} > t) - e^{-t/\alpha_j} \big| \le \pi(j)\,\frac{EW^2}{(EW)^2} = 2(1-\pi(j))\,\rho_{\pi,j} , \tag{7.1}
\]

where $\alpha_j = ET_{\pi,j}$ and $\rho_{\pi,j} = \dfrac{ET_{\pi,j}^2}{2\alpha_j^2} - 1$.

Inequality (7.1) tells us that if the first two moments of $T_{\pi,j}$
behave similarly to those of an exponential distribution, then $T_{\pi,j}$ is
approximately exponential.

For time reversible continuous time chains $T_{\pi,j}$ is completely monotone
(Keilson (1979) p. 11). It follows from Brown (1983) p. 422 that:

\[
\sup_t \big| \Pr(T_{\pi,j} > t) - e^{-t/\alpha_j} \big| \le \frac{\rho_{\pi,j}}{\rho_{\pi,j}+1} \tag{7.2}
\]
(7.3)  sup|Pr(T  .  >  t)-(l-Tr(j))e

t  11  >  3 


-t  (1-tt  ( j  )  )  /a  .  p  . 


P  .+1 


Expression (7.3) is used to bound the error of a modified exponential
approximation which takes into account the atom of size $\pi(j)$ at 0. Both
(7.2) and (7.3) hold for first passage times to arbitrary (rather than just
singleton) sets. The reason for our restricting attention to singleton sets
is that (1.2) will provide the means of expressing the bound in terms of
$\tau/\alpha_j$, where $\tau$ is the relaxation time. In applications $\tau$ is easier to
numerically approximate, and to bound, than is $ET_{\pi,j}^2$.

We now focus on the moments of $T_{\pi,j}$. First recall (Keilson (1979)
p. 34) the spectral representation of transition probabilities in continuous
time, time reversible chains:

\[
P_t(j,j) = \pi(j) + \sum_{k=1}^{m} u_{kj}^2\, e^{-\lambda_k t} , \tag{7.4}
\]

where $0 > -\lambda_1 > -\lambda_2 > \cdots > -\lambda_m$ are the eigenvalues of the infinitesimal
matrix.

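It follows from (7.4) and (6.1) that $W \sim UE$ with U and E independent,
E exponential with mean 1, and
$\Pr(U = \lambda_k^{-1}) = \dfrac{u_{kj}^2}{1-\pi(j)}$, k=1,...,m. Thus
$EW = EU$, and $EW^2 = 2EU^2$. Thus:

\[
\alpha_j = ET_{\pi,j} = \frac{1-\pi(j)}{\pi(j)}\,EU = \frac{1}{\pi(j)} \sum_{k} u_{kj}^2\, \lambda_k^{-1} \tag{7.5}
\]

\[
\mathrm{Var}\,[E(T_{\pi,j} \mid N)] = \mathrm{Var}(N\,EU) = \frac{1-\pi(j)}{\pi^2(j)}\,(EU)^2 \tag{7.6}
\]

\[
E[\mathrm{Var}(T_{\pi,j} \mid N)] = E[N\,\mathrm{Var}\,W] = \frac{1-\pi(j)}{\pi(j)}\,[2EU^2 - (EU)^2] . \tag{7.7}
\]

From (7.5)-(7.7):

\[
\mathrm{Var}\,T_{\pi,j} = \alpha_j^2 + \frac{2}{\pi(j)} \sum_{k} u_{kj}^2\, \lambda_k^{-2} \le \alpha_j^2 + 2\tau\alpha_j , \tag{7.8}
\]

where $\tau = \max_k(\lambda_k^{-1})$, the relaxation time. From (7.8):

\[
\frac{\rho_{\pi,j}}{\rho_{\pi,j}+1} = 1 - \frac{2\alpha_j^2}{ET_{\pi,j}^2} \le \frac{\tau/\alpha_j}{(\tau/\alpha_j)+1} . \tag{7.9}
\]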

We now summarize:

Theorem 7.1. Let {X(t), t ≥ 0} be a continuous time, time reversible Markov
chain, with finite state space. Then:

(i)
\[
\sup_t \big| \Pr(T_{\pi,j} > t) - e^{-t/\alpha_j} \big| \le \frac{\tau/\alpha_j}{(\tau/\alpha_j)+1}
\]

(ii)  sup  I  Pr  (T  .  >  t )-(1-tt(  j)  )e 

t  "  >  J 


-(l-ir(j))t/ai  T/°‘i 

Jl  iTv7rfe-"«>.  0 
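Part (i) of Theorem 7.1, together with (7.5), (7.8) and (7.9), can be illustrated numerically. The sketch below is a minimal check under stated assumptions: the 3-state reversible generator Q, the target state j = 2 and the time grid are choices made for the example, and all names are hypothetical. The spectral weights $u_{kj}^2$ are obtained by symmetrizing Q with the stationary distribution.

```python
import numpy as np

# Illustrative reversible generator on {0,1,2} (an assumption for this sketch).
Q = np.array([[-2.0, 2.0, 0.0],
              [1.0, -2.0, 1.0],
              [0.0, 2.0, -2.0]])
pi = np.array([0.25, 0.5, 0.25])   # stationary distribution of Q
j = 2

# Symmetrize: S = D^{1/2} Q D^{-1/2}, whose orthonormal eigenvectors give u_{kj}.
root = np.sqrt(pi)
S = root[:, None] * Q / root[None, :]
w, V = np.linalg.eigh(S)
keep = w < -1e-10                  # drop the zero eigenvalue
lam = -w[keep]
u2 = V[j, keep] ** 2

alpha = (u2 / lam).sum() / pi[j]                                  # (7.5)
second_moment = 2 * alpha**2 + 2 * (u2 / lam**2).sum() / pi[j]    # from (7.8)
rho = second_moment / (2 * alpha**2) - 1
tau = (1.0 / lam).max()                                           # relaxation time

# Pr(T_{pi,j} > t) computed directly from the generator restricted to states != j.
others = [s for s in range(len(pi)) if s != j]
Q_sub = Q[np.ix_(others, others)]
ev, R = np.linalg.eig(Q_sub)
R_inv = np.linalg.inv(R)

def survival(t):
    expm = (R * np.exp(ev * t)) @ R_inv        # exp(Q_sub * t)
    return float((pi[others] @ expm.real).sum())

grid = np.linspace(0.0, 20.0, 2001)
sup_err = max(abs(survival(t) - np.exp(-t / alpha)) for t in grid)

print(alpha, rho, tau)
print(sup_err, rho / (rho + 1), (tau / alpha) / (tau / alpha + 1))
```

The last line prints the observed error on the grid followed by the bound (7.2) and its relaxation time version; the grid maximum is only a lower estimate of the true supremum.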


The difficulty in dealing with discrete time, time reversible chains,
is that due to possibly negative eigenvalues $T_{\pi,j}$ need not be a mixture
of geometric distributions. Thus (7.2) and (7.3) are not applicable.
Nevertheless, if $P_n(j,j)$ is decreasing then (7.1) applies. Moreover, as
follows from an argument of Aldous ((1989) p. 183):

\[
\rho_{\pi,j}^{(\text{discrete})} = \rho_{\pi,j}^{(\text{continuous})} - (2\alpha_j)^{-1} . \tag{7.10}
\]

Combining (7.1), (7.9) and (7.10) we easily derive:

Corollary 7.1. Let $\{X_n, n=0,1,\dots\}$ be a time reversible Markov chain with
$P_n(j,j)$ decreasing. Then:

\[
\sup_t \big| \Pr(T_{\pi,j} > t) - e^{-t/\alpha_j} \big| \le (1-\pi(j)) \Big( \frac{2\tau - 1}{\alpha_j} \Big) ,
\]

where $1 > \beta_1 \ge \beta_2 \ge \cdots \ge \beta_m$ are the eigenvalues of P, and $\tau = (1-\beta_1)^{-1}$. []

In  the  case  of  a  discrete  time,  time  reversible  Markov  chain,  with  all 
nonnegative  eigenvalues,  the  methodology  of  Brown  (1983),  p.  422,  can  be 
applied  to  derive: 


(7.11) 


suplPrCT 
n  ’ J 


a .  n 

>n)'(S^r)  I 

3 


<  - 

(— )+l 

a . 

3 


Finally we remark that it follows from an argument of Brown (1987)
p. 15, and (1.1), that if $\pi_t(j)$ is increasing then:

\[
\sup_t \big| \Pr(T_{\pi_0,j} > t) - e^{-t/ET_{\pi_0,j}} \big| \le \frac{EY}{\alpha_j} + \sup_t \big| \Pr(T_{\pi,j} > t) - e^{-t/\alpha_j} \big| . \tag{7.12}
\]

Thus, when EY is small compared to $\alpha_j$, approximate exponentiality
of $T_{\pi,j}$ also yields approximate exponentiality of $T_{\pi_0,j}$.

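Example. Consider a continuous time random walk on the cube, $\{0,1\}^d$, with:

\[
\begin{aligned}
q(x, x+\delta_i) &= \mu_i > 0, \quad x_i = 0 \\
q(x, x-\delta_i) &= \eta_i > 0, \quad x_i = 1
\end{aligned}
\]

The key quantities are:

\[
\tau = \big[ \min_i (\mu_i + \eta_i) \big]^{-1} \tag{7.13}
\]

\[
\alpha_j = \sum_{\gamma \ne \emptyset} s_\gamma^{-1} \Big( \prod_{i \in \gamma \cap B_j} \frac{\mu_i}{\eta_i} \Big) \Big( \prod_{i \in \gamma \cap B_j^c} \frac{\eta_i}{\mu_i} \Big) \tag{7.14}
\]

where $\gamma$ ranges through non-empty subsets of {1,...,d}, $s_\gamma = \sum_{i \in \gamma} (\mu_i + \eta_i)$,
and $B_j = \{i : j_i = 0\}$, the zero components of the vector j.

\[
ET_{\pi,j}^2 = 2 \sum_{\gamma \ne \emptyset} s_\gamma^{-2} \Big( \prod_{i \in \gamma \cap B_j} \frac{\mu_i}{\eta_i} \Big) \Big( \prod_{i \in \gamma \cap B_j^c} \frac{\eta_i}{\mu_i} \Big) + 2\alpha_j^2 \tag{7.15}
\]

\[
EY = \sum_{k=1}^{d} (-1)^{k-1} \sum_{\gamma \in \Gamma_k} s_\gamma^{-1} \tag{7.16}
\]

where $\Gamma_k$ is the collection of subsets of size k from {1,...,d}.

Now, we simplify by letting $\mu_i = \eta_i = c > 0$, obtaining the symmetric
walk. In Table 1 below we consider the cases d=10 and d=20.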

In column 1 of Table 1 we compute the error bound (7.2) for exponential
approximation of $T_{\pi,j}$. In column 2, we replace
$\rho_{\pi,j} = \big( \frac{ET_{\pi,j}^2}{2\alpha_j^2} - 1 \big)$ by
its upper bound $\tau/\alpha_j$, obtaining the approximation (i) of Theorem 7.1.
We see that the exponential approximation is quite accurate, and the relaxation
time simplification gives a usable though quite conservative upper bound.

In column 3 we compute the refined error bound, (7.3), which adjusts
the approximating distribution to mimic the known probability, $\Pr(T_{\pi,j}=0)$.
In column 4 the $\rho$ based bound is replaced by the corresponding relaxation
time expression (Theorem 7.1, (ii)). It is seen that this approximation is
remarkably accurate. Much is lost in using the relaxation time upper bound,
with the error bound over 100 times too large with d=20. However, even
the conservative bound is quite small.

In column 5 we table the error bound in exponential approximation of
$T_{x,1-x}$, by use of (7.12) and the column 1 error bound for $T_{\pi,1-x}$ (note that
in the symmetric case the error bound for $T_{\pi,j}$ is independent of j). In
column 6, we look at the analogous quantity when the column 2 error bound is
used with (7.12). Here, since $EY/\alpha$ is the dominant contribution to error,
little is lost in the relaxation time simplification.
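For the symmetric walk the quantities behind columns 1, 2, 5 and 6 reduce to sums over subset sizes, since $s_\gamma = 2c|\gamma|$ and every product in (7.14) and (7.15) equals 1. The sketch below recomputes them from (7.2), (7.9), (7.12) and (7.13)-(7.16). It is an illustration only: the function name, the dictionary keys and the default c = 1 are assumptions, and no attempt is made to reproduce the entries of Table 1, which are not available in this copy.

```python
from fractions import Fraction as F
from math import comb

def cube_quantities(d, c=F(1)):
    """Bounds for the symmetric walk on {0,1}^d with mu_i = eta_i = c."""
    def s(k):
        return 2 * c * k                  # s_gamma for a subset of size k

    tau = 1 / s(1)                                                    # (7.13)
    alpha = sum(comb(d, k) / s(k) for k in range(1, d + 1))           # (7.14)
    second_moment = (2 * sum(comb(d, k) / s(k) ** 2 for k in range(1, d + 1))
                     + 2 * alpha ** 2)                                # (7.15)
    EY = sum((-1) ** (k - 1) * comb(d, k) / s(k) for k in range(1, d + 1))  # (7.16)

    rho = second_moment / (2 * alpha ** 2) - 1
    col1 = rho / (rho + 1)                         # bound (7.2)
    col2 = (tau / alpha) / (tau / alpha + 1)       # Theorem 7.1 (i)
    return {
        "tau": tau, "alpha": alpha, "rho": rho, "EY": EY,
        "col1": col1, "col2": col2,
        "col5": EY / alpha + col1,                 # (7.12) with the column 1 bound
        "col6": EY / alpha + col2,                 # (7.12) with the column 2 bound
    }

for d in (10, 20):
    q = cube_quantities(d)
    print(d, {name: float(value) for name, value in q.items()})
```

The ratios $\tau/\alpha_j$, $\rho_{\pi,j}$ and $EY/\alpha_j$ do not depend on c, so the choice c = 1 affects only the scale of `tau`, `alpha` and `EY`.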


Table 
Section 8. Comments and Additions.


1) Corollary 4.1 justifies the heuristic interpretation of (1.1)
given in Section 1, under appropriate conditions. In general, $\pi_t(j)$
increasing does not imply that $s(t) = 1 - \dfrac{\pi_t(j)}{\pi(j)}$. The following example,
which illustrates this point, was suggested to me by David Aldous.

Consider a birth and death process on {0,1,2} with $q_{01} = q_{21} = 2$,
$q_{10} = q_{12} = 1$. Then:

\[
p_t(0,1) = \tfrac{1}{2}\,(1 - e^{-4t})
\]

which is increasing. But from Lemma 5.1:

\[
s(t) = 1 - \frac{p_t(0,2)}{\pi(2)} = 1 - (1 - e^{-2t})^2 = e^{-4t}(2e^{2t} - 1) ,
\]

which differs from $1 - p_t(0,1)/\pi(1) = e^{-4t}$.
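The example can be confirmed numerically. The sketch below assumes the rates $q_{01} = q_{21} = 2$ and $q_{10} = q_{12} = 1$, which reproduce the stated $p_t(0,1)$, and takes the separation from state 0 to be $\max_k [1 - p_t(0,k)/\pi(k)]$; the function names are hypothetical.

```python
import numpy as np

# Generator for the assumed rates q01 = q21 = 2, q10 = q12 = 1.
Q = np.array([[-2.0, 2.0, 0.0],
              [1.0, -2.0, 1.0],
              [0.0, 2.0, -2.0]])
pi = np.array([0.25, 0.5, 0.25])       # stationary distribution of Q

eigvals, R = np.linalg.eig(Q)
R_inv = np.linalg.inv(R)

def transition_matrix(t):
    """P_t = exp(Qt), via the eigendecomposition of Q."""
    return ((R * np.exp(eigvals * t)) @ R_inv).real

def separation_from_0(t):
    """max over k of 1 - p_t(0,k)/pi(k)."""
    return float(np.max(1.0 - transition_matrix(t)[0] / pi))

for t in (0.1, 0.5, 1.0):
    p01 = transition_matrix(t)[0, 1]
    print(t, p01, separation_from_0(t), np.exp(-4 * t))
```

For each t the third printed value (the separation) exceeds the fourth, $e^{-4t} = 1 - p_t(0,1)/\pi(1)$, even though $p_t(0,1)$ is increasing.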


I have found that in some cases (1.1) can be explained in the following
way. There exists a distribution $\pi'$ such that $T_{\pi',j} \sim T_{\pi,j}$, and a stopping
time Y with $X(Y) \sim \pi'$, X(Y) independent of Y, Pr(X(t)=j for some t < Y) = 0,
and $\Pr(Y > t) = 1 - \dfrac{\pi_t(j)}{\pi(j)}$. If this holds we can interpret Y as the waiting
time from $\pi_0$ to $\pi'$. I do not know for how wide a class of situations this
interpretation applies.


2) From (4.1):

\[
d(t) \le \big( 1 - P_\pi(\pi(X) \le \pi_t(X)) \big)\, s(t) . \tag{8.1}
\]

Suppose that the conditions of Theorem 3.2 hold and in addition
$\pi_0 = \delta_i$, where i is the unique minimal state ($i \le k$ for all $k \in S$). Then by
Remark (6.6), $p_t(i,i)$ is decreasing, and thus $p_t(i,i) \ge \pi(i)$ for all t.
Thus:

\[
P_\pi(\pi(X) \le \pi_t(X)) \ge \pi(i) \tag{8.2}
\]

and from (8.1) and (8.2):

\[
d(t) \le (1 - \pi(i)) \Big( 1 - \frac{p_t(i,M)}{\pi(M)} \Big) .
\]

This inequality applies to ergodic stochastically monotone birth and
death chains on {0,...,d} with i=0 and M=d. It also applies to the
non-symmetric walk on the cube discussed in Section 5.1. Here:

\[
d_x(n) \le (1 - \pi(x))\, s_x(n)
\]

where $s_x(n)$ is given by (5.2), and:

\[
\pi(x) = \Big[ \prod_{i=1}^{d} (\alpha_i + \beta_i) \Big]^{-1} \Big( \prod_{x_i=0} \beta_i \Big) \Big( \prod_{x_i=1} \alpha_i \Big) .
\]
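The inequality for monotone birth and death chains can be checked exactly on a small example. In the sketch below d(n) is assumed to be the total variation distance and s(n) the separation distance; the 4-state matrix (with i = 0 and M = 3) and the function names are choices made for this illustration.

```python
from fractions import Fraction as F

# Illustrative stochastically monotone birth and death chain on {0,1,2,3}.
P = [[F(1, 2), F(1, 2), F(0), F(0)],
     [F(1, 2), F(1, 4), F(1, 4), F(0)],
     [F(0), F(1, 4), F(1, 2), F(1, 4)],
     [F(0), F(0), F(1, 2), F(1, 2)]]
pi = [F(2, 7), F(2, 7), F(2, 7), F(1, 7)]   # stationary distribution of P
S = range(len(P))

def row_after(n, start=0):
    """Distribution of X_n given X_0 = start."""
    row = [F(int(k == start)) for k in S]
    for _ in range(n):
        row = [sum(row[k] * P[k][x] for k in S) for x in S]
    return row

def variation_distance(row):
    """Total variation distance between row and pi (assumed meaning of d)."""
    return sum(abs(p - q) for p, q in zip(row, pi)) / 2

def separation(row):
    """Separation distance max_x [1 - row(x)/pi(x)] (assumed meaning of s)."""
    return max(1 - p / q for p, q in zip(row, pi))

for n in range(6):
    row = row_after(n)
    print(n, variation_distance(row), (1 - pi[0]) * separation(row))
```

At n = 0 the two printed values coincide (both equal 5/7), so the factor $1 - \pi(i)$ cannot be improved in general.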
Appendix.

We will derive the expressions for transition probabilities that were
used in Sections 5.2 and 5.3. These are special cases of more general
results I plan to present in a forthcoming paper. The theme, begun in
Brown-Shao (1987), is that one can develop expressions for transition
probabilities and first passage time distributions, which depend on
eigenvalues, but not explicitly on eigenvectors. Nor are eigenvectors needed to
derive these expressions.

Suppose that P, an m x m matrix, is diagonalizable, so that $P = ADA^{-1}$,
where D is diagonal with diagonal entries $d_1,\dots,d_m$, which are necessarily
the eigenvalues of P. Since $P^n = AD^nA^{-1}$ for all n = 0,1,..., it follows
that $f(P) = Af(D)A^{-1}$ for polynomials, f, where f(D) is a diagonal matrix
with diagonal entries $f(d_1),\dots,f(d_m)$. Let $\beta_1,\dots,\beta_k$ denote the
$1 \le k \le m$ distinct eigenvalues of P. It follows that for polynomials,
f and g, f(P) = g(P) if and only if $f(\beta_i) = g(\beta_i)$, i=1,...,k. Now, let
$f(x) = x^n$ and $g(x) = \sum_{i=1}^{k} \big[ \prod_{r \ne i} \frac{x-\beta_r}{\beta_i-\beta_r} \big] \beta_i^n$. The polynomial g(x) is the well
known Lagrange interpolation polynomial (Birkhoff and Rota (1969), p. 215).
It is a polynomial of degree k-1, with $g(\beta_i) = f(\beta_i) = \beta_i^n$, i=1,...,k. It
follows that g(P) = f(P), thus:

\[
P^n = \sum_{i=1}^{k} \Big( \prod_{r \ne i} \frac{P - \beta_r I}{\beta_i - \beta_r} \Big) \beta_i^n . \tag{8.1}
\]

This result, (8.1), is discussed in Gantmacher (1960) p. 101, and in
Dunford and Schwartz (1958) p. 562.

Assume that we have a skip free to the right Markov chain on {0,1,...,d}
with d+1 distinct eigenvalues $1, \beta_1, \dots, \beta_d$, and transition matrix P.
Distinctness of the eigenvalues implies that P is diagonalizable, and thus
that (8.1) holds. Noting that $p_n(0,d) = 0$, n=0,...,d-1, and
$p_d(0,d) = \prod_{i=0}^{d-1} P(i,i+1)$, we have from (8.1), with products over $r \ne i$
taken over r in {1,...,d}:

\[
p_n(0,d) = p_d(0,d) \Big[ \frac{1}{\prod_{i=1}^{d} (1-\beta_i)} - \sum_{i=1}^{d} \Big( \prod_{r \ne i} \frac{1}{\beta_i - \beta_r} \Big) \frac{\beta_i^n}{1-\beta_i} \Big] . \tag{8.2}
\]

Letting $n \to \infty$ in (8.2), we see that if the Markov chain is ergodic then
$p_d(0,d) = \pi(d) \prod_{i=1}^{d} (1-\beta_i)$, thus:

\[
p_n(0,d) = \pi(d) \Big[ 1 - \sum_{i=1}^{d} \Big( \prod_{r \ne i} \frac{1-\beta_r}{\beta_i-\beta_r} \Big) \beta_i^n \Big] \tag{8.3}
\]

\[
1 - \frac{p_n(0,d)}{\pi(d)} = \sum_{i=1}^{d} \Big( \prod_{r \ne i} \frac{1-\beta_r}{\beta_i-\beta_r} \Big) \beta_i^n . \tag{8.4}
\]


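Next, consider $p_n(1,d)$, noting that $p_n(1,d) = 0$, n=0,...,d-2,
$p_{d-1}(1,d) = (P(0,1))^{-1} p_d(0,d)$, and $p_d(1,d) = (P(0,1))^{-1} \big( \sum_{i=1}^{d} P(i,i) \big) p_d(0,d)$.
Furthermore $\sum_{i=1}^{d} P(i,i) = \mathrm{tr}(P) - P(0,0) = \sum_{i=1}^{d} \beta_i + P(0,1)$. Thus:

\[
p_{d-1}(1,d) = \pi(d)\,(P(0,1))^{-1} \prod_{i=1}^{d} (1-\beta_i) \tag{8.5}
\]

\[
p_d(1,d) = \pi(d)\,(P(0,1))^{-1} \Big[ \sum_{i=1}^{d} \beta_i + P(0,1) \Big] \prod_{i=1}^{d} (1-\beta_i) \tag{8.6}
\]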


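From (8.1):

\[
\begin{aligned}
p_n(1,d) = {} & p_d(1,d) \Big[ \frac{1}{\prod_{i=1}^{d} (1-\beta_i)} - \sum_{i=1}^{d} \Big( \prod_{r \ne i} \frac{1}{\beta_i-\beta_r} \Big) \frac{\beta_i^n}{1-\beta_i} \Big] \\
& - p_{d-1}(1,d) \Big[ \frac{\sum_{i=1}^{d} \beta_i}{\prod_{i=1}^{d} (1-\beta_i)} - \sum_{i=1}^{d} \Big( \prod_{r \ne i} \frac{1}{\beta_i-\beta_r} \Big) \frac{\big( \sum_{r \ne i} \beta_r \big) + 1}{1-\beta_i}\, \beta_i^n \Big] ,
\end{aligned} \tag{8.7}
\]

where $\big( \sum_{r \ne i} \beta_r \big) + 1 = \big( \sum_{r=1}^{d} \beta_r \big) + (1-\beta_i)$.

Substituting (8.5) and (8.6) into (8.7) and collecting terms we obtain:

\[
p_n(1,d) = \pi(d) \Big[ 1 - \sum_{i=1}^{d} \Big( \prod_{r \ne i} \frac{1-\beta_r}{\beta_i-\beta_r} \Big) \Big( \frac{\beta_i - P(0,0)}{P(0,1)} \Big) \beta_i^n \Big] \tag{8.8}
\]

\[
1 - \frac{p_n(1,d)}{\pi(d)} = \sum_{i=1}^{d} \Big( \prod_{r \ne i} \frac{1-\beta_r}{\beta_i-\beta_r} \Big) \Big( \frac{\beta_i - P(0,0)}{P(0,1)} \Big) \beta_i^n . \tag{8.9}
\]

Similar expressions can be derived for $p_n(j,d)$, j=2,...,d-1.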
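Both the matrix identity (8.1) and the eigenvalue expression for $1 - p_n(1,d)/\pi(d)$ can be tested numerically. The sketch below is an illustration under stated assumptions: the lazy birth and death matrix is a hypothetical example with distinct eigenvalues, the function names are invented here, and (8.9) is taken in the form with exponent n, consistent with the one-step relation $p_n(0,d) = P(0,0)\,p_{n-1}(0,d) + P(0,1)\,p_{n-1}(1,d)$.

```python
import numpy as np

# Hypothetical lazy birth and death chain on {0,1,2,3} with distinct eigenvalues.
P = np.array([
    [0.75, 0.25, 0.00, 0.00],
    [0.25, 0.50, 0.25, 0.00],
    [0.00, 0.25, 0.50, 0.25],
    [0.00, 0.00, 0.25, 0.75],
])
d = P.shape[0] - 1
I = np.eye(d + 1)

eig_all = np.linalg.eigvals(P).real     # the d+1 eigenvalues, including 1
beta = np.sort(eig_all)[:-1]            # the d eigenvalues other than 1

# Stationary distribution from the left eigenvector for eigenvalue 1.
w, V = np.linalg.eig(P.T)
pi = V[:, np.argmax(w.real)].real
pi = pi / pi.sum()

def power_via_8_1(n):
    """P^n assembled from the Lagrange interpolation formula (8.1)."""
    total = np.zeros_like(P)
    for i, bi in enumerate(eig_all):
        term = I.copy()
        for r, br in enumerate(eig_all):
            if r != i:
                term = term @ (P - br * I) / (bi - br)
        total += term * bi ** n
    return total

def s1_via_8_9(n):
    """Right side of (8.9), using only eigenvalues, P(0,0) and P(0,1)."""
    total = 0.0
    for i in range(d):
        coef = np.prod([(1 - beta[r]) / (beta[i] - beta[r])
                        for r in range(d) if r != i])
        total += coef * (beta[i] - P[0, 0]) / P[0, 1] * beta[i] ** n
    return total

for n in (0, 2, 5):
    direct = 1.0 - np.linalg.matrix_power(P, n)[1, d] / pi[d]
    print(n, direct, s1_via_8_9(n))
```

No eigenvectors of P enter either function, which is the point made at the start of the Appendix; eigenvectors are used here only to obtain the stationary distribution for the comparison.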

